{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<center>\n",
    "    <p style=\"text-align:center\">\n",
    "        <img alt=\"phoenix logo\" src=\"https://storage.googleapis.com/arize-phoenix-assets/assets/phoenix-logo-light.svg\" width=\"200\"/>\n",
    "        <br>\n",
    "        <a href=\"https://arize.com/docs/phoenix/\">Docs</a>\n",
    "        |\n",
    "        <a href=\"https://github.com/Arize-ai/phoenix\">GitHub</a>\n",
    "        |\n",
    "        <a href=\"https://arize-ai.slack.com/join/shared_invite/zt-2w57bhem8-hq24MB6u7yE_ZF_ilOYSBw#/shared-invite/email\">Community</a>\n",
    "    </p>\n",
    "</center>\n",
    "<h1 align=\"center\">Image Classification with Phoenix</h1>\n",
    "\n",
    "In this tutorial, you will:\n",
    "- Upload a dataset of images to Phoenix\n",
    "- Create an experiment to classify the images and measure the accuracy of the model you use\n",
    "- View images in the Phoenix UI\n",
    "\n",
    "ℹ️ This notebook requires an OpenAI API key."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Install dependencies"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%%bash\n",
    "pip install -q \"arize-phoenix>=4.29.0\" openinference-instrumentation-openai openai datasets 'httpx<0.28'"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Connect to Phoenix"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Check if PHOENIX_API_KEY is present in the environment variables.\n",
    "# If it is, we'll use the cloud instance of Phoenix. If it's not, we'll start a local instance.\n",
    "# A third option is to connect to a docker or locally hosted instance.\n",
    "# See https://arize.com/docs/phoenix/setup/environments for more information.\n",
    "\n",
    "import os\n",
    "\n",
    "if \"PHOENIX_API_KEY\" in os.environ:\n",
    "    os.environ[\"PHOENIX_CLIENT_HEADERS\"] = f\"api_key={os.environ['PHOENIX_API_KEY']}\"\n",
    "    os.environ[\"PHOENIX_COLLECTOR_ENDPOINT\"] = \"https://app.phoenix.arize.com\"\n",
    "\n",
    "else:\n",
    "    import phoenix as px\n",
    "\n",
    "    px.launch_app()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from phoenix.otel import register\n",
    "\n",
    "tracer_provider = register()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Load dataset of test cases"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from datasets import load_dataset\n",
    "\n",
    "import phoenix as px\n",
    "\n",
    "df = load_dataset(\"huggingface/image-classification-test-sample\")[\"train\"].to_pandas()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "df.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We first need to convert the image data to a base64 encoded string. Phoenix expects the image data to be in either this format or as a URL."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import base64\n",
    "\n",
    "# Extract the bytes object from the dictionary and update the 'img' column\n",
    "df[\"img\"] = df[\"img\"].apply(lambda x: x[\"bytes\"])\n",
    "\n",
    "# Base64 encode the value in 'img' column\n",
    "df[\"img\"] = df[\"img\"].apply(lambda x: base64.b64encode(x).decode(\"utf-8\"))\n",
    "\n",
    "\n",
    "# Append 'data:image/png;base64,' to the beginning of each value in the 'image' column\n",
    "df[\"img\"] = df[\"img\"].apply(lambda x: \"data:image/png;base64,\" + x)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Next, let's map the numerical labels to the actual labels. This will make it easier to compare the model's output to the expected output."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "label_map = {\n",
    "    1: \"automobile\",\n",
    "    2: \"snakes\",\n",
    "    3: \"cat\",\n",
    "    4: \"tree\",\n",
    "    5: \"dog\",\n",
    "    6: \"frog\",\n",
    "    7: \"horse\",\n",
    "    8: \"ship\",\n",
    "}\n",
    "df[\"label\"] = df[\"label\"].map(label_map)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "df.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Our dataset is now ready to upload to Phoenix. Let's take a look at the first image in the dataset just to make sure it looks right."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from IPython.display import Image, display\n",
    "\n",
    "# Get the image data from the first row\n",
    "image_data = df.loc[0, \"img\"]\n",
    "\n",
    "# Remove the data URI prefix\n",
    "image_data = image_data.split(\",\")[1]\n",
    "\n",
    "# Decode the base64 string\n",
    "image_bytes = base64.b64decode(image_data)\n",
    "\n",
    "# Display the image\n",
    "display(Image(data=image_bytes))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "From here, we can upload the dataset to Phoenix. This dataset will act as our test cases for the experiment we'll run later."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import datetime\n",
    "\n",
    "test_cases = px.Client().upload_dataset(\n",
    "    dataset_name=f\"image-classification-test-sample-{datetime.datetime.now().strftime('%Y-%m-%d-%H-%M-%S')}\",\n",
    "    dataframe=df,\n",
    "    input_keys=[\"img\"],\n",
    "    output_keys=[\"label\"],\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Create our experiment task"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import os\n",
    "\n",
    "os.environ[\"OPENINFERENCE_BASE64_IMAGE_MAX_LENGTH\"] = (\n",
    "    \"10000000000\"  # this ensures that the image data is not truncated\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We'll be using OpenAI's GPT-4o-mini model to classify the images. Let's make sure we'll be able to properly see all the traces generated by this model by instrumenting it."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from openinference.instrumentation.openai import OpenAIInstrumentor\n",
    "\n",
    "OpenAIInstrumentor().instrument(tracer_provider=tracer_provider, skip_dep_check=True)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We also need to set the OpenAI API key in the environment variables."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import getpass\n",
    "\n",
    "if \"OPENAI_API_KEY\" not in os.environ:\n",
    "    os.environ[\"OPENAI_API_KEY\"] = getpass.getpass(\"Enter your OpenAI API key: \")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now we can define our task. This task will take an image and use GPT-4o-mini to classify it based on the labels we uploaded."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from openai import OpenAI\n",
    "\n",
    "\n",
    "def task(input):\n",
    "    client = OpenAI()\n",
    "\n",
    "    response = client.chat.completions.create(\n",
    "        model=\"gpt-4o-mini\",\n",
    "        messages=[\n",
    "            {\n",
    "                \"role\": \"user\",\n",
    "                \"content\": [\n",
    "                    {\n",
    "                        \"type\": \"text\",\n",
    "                        \"text\": \"What’s in this image? Your answer should be a single word. The word should be one of the following: \"\n",
    "                        + str(label_map.values()),\n",
    "                    },\n",
    "                    {\n",
    "                        \"type\": \"image_url\",\n",
    "                        \"image_url\": {\n",
    "                            \"url\": input[\"img\"],\n",
    "                        },\n",
    "                    },\n",
    "                ],\n",
    "            }\n",
    "        ],\n",
    "        max_tokens=300,\n",
    "    )\n",
    "\n",
    "    output_label = response.choices[0].message.content.lower()\n",
    "    return output_label"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Create our evaluators"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Next, we have to set up our evaluators. In this case, our evaluators are very simple, because we have ground truth labels for each image. All we need to do is check if the model's output matches the expected output."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def matches_expected_label(expected, output):\n",
    "    return expected[\"label\"] == output"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Run our experiment"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "With that, we're ready to run our experiment. This function will run each row of our test_cases dataset through our task function and evaluate the output using our evaluators. Results will be displayed below, and uploaded to the Phoenix UI."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import nest_asyncio\n",
    "\n",
    "from phoenix.experiments import run_experiment\n",
    "\n",
    "nest_asyncio.apply()\n",
    "\n",
    "run_experiment(\n",
    "    task=task,\n",
    "    evaluators=[matches_expected_label],\n",
    "    dataset=test_cases,\n",
    "    experiment_description=\"Image classification experiment\",\n",
    "    experiment_metadata={\"model\": \"gpt-4o\"},\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "If you're running this in Colab, you can view the experiment by clicking on the URL in the cell below."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "if px.active_session():\n",
    "    px.active_session().view()"
   ]
  }
 ],
 "metadata": {
  "language_info": {
   "name": "python"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
